DFS-based frequent graph pattern extraction to characterize the content of RDF Triple Stores

Authors

  • Adrien Basse
  • Fabien Gandon
  • Isabelle Mirbel
  • Moussa Lo
Abstract

Semantic web applications often access distributed triple stores that rely on different ontologies and maintain bases of RDF annotations about different domains. Use cases often involve queries whose results combine pieces of annotations distributed over several bases maintained on different servers. In this context, one key issue is to characterize the content of RDF bases in order to identify their potential contributions to the processing of a query. In this paper, we propose an algorithm to extract a compact representation of the content of an RDF repository. We first improve the canonical representation of RDF graphs based on the DFS code proposed in the literature. We then provide a join operator that significantly reduces the number of frequent graph patterns generated from the analysis of the content of the base, and we reduce the index size by keeping only the graph patterns with maximal coverage. Our algorithm has been tested on different data sets, as discussed in the conclusion.
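To make the idea of a canonical representation concrete, here is a minimal Python sketch. It is not the DFS code construction of the paper: it swaps in a brute-force stand-in that tries every node numbering and keeps the lexicographically smallest edge list, which is only practical for very small patterns. The function name `canonical_code`, the graph encoding, and the example labels (`Person`, `Document`, `author`, `knows`) are all assumptions made for illustration.

```python
from itertools import permutations


def canonical_code(nodes, edges):
    """Return a canonical code for a small labelled directed graph.

    nodes: dict mapping node id -> type label (hypothetical encoding)
    edges: iterable of (source id, predicate label, target id)

    Brute force over all node numberings; NOT the DFS code of the paper.
    Nodes without any edge do not appear in the code.
    """
    ids = list(nodes)
    edges = list(edges)
    best = None
    for perm in permutations(range(len(ids))):
        index = dict(zip(ids, perm))
        # Describe every edge with the candidate numbering, then sort
        # so the code does not depend on the input order of the edges.
        code = tuple(sorted(
            (index[s], index[d], nodes[s], p, nodes[d]) for s, p, d in edges
        ))
        if best is None or code < best:
            best = code
    return best


# Two descriptions of the same pattern, with different node ids and edge order.
g1_nodes = {"a": "Person", "b": "Document", "c": "Person"}
g1_edges = [("a", "author", "b"), ("a", "knows", "c")]

g2_nodes = {"x": "Person", "y": "Person", "z": "Document"}
g2_edges = [("y", "knows", "x"), ("y", "author", "z")]

# A different pattern: the person who is known is the author.
g3_edges = [("x", "knows", "y"), ("y", "author", "z")]

same = canonical_code(g1_nodes, g1_edges) == canonical_code(g2_nodes, g2_edges)
different = canonical_code(g1_nodes, g1_edges) != canonical_code(g2_nodes, g3_edges)
```

Isomorphic patterns receive the same code whatever their node ids are, so duplicate patterns can be detected by comparing codes. A DFS-based code aims at the same property without enumerating every numbering.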

Similar resources

Incremental characterization of RDF Triple Stores

Many semantic web applications integrate data from distributed triple stores and to be efficient, they need to know what kind of content each triple store holds in order to assess if it can contribute to its queries. We present an algorithm to build indexes summarizing the content of triple stores. We extended Depth-First Search coding to provide a canonical representation of RDF graphs and we ...


Context-aware access control and presentation of linked data. (Contrôle d'accès et présentation contextuelle pour le Web des données)

This thesis discusses the influence of mobile context awareness on Web of Data access from handheld devices. The work dissects this issue into three research questions: how to declaratively describe context by complying with Linked Data best practices, how to enable context-aware adaptation for Linked Data consumption, and how to protect access to RDF stores from context-aware devices. The firs...


Benchmarks for SPARQL Property Paths (Bachelor's thesis)

The Resource Description Framework (RDF) is a triple-based representation of directed graphs with labelled edges. With the emergence of RDF graphs, special databases called RDF stores were developed. To query the graphs stored in these RDF stores, the query language SPARQL (SPARQL Protocol and RDF Query Language) is used. With the help of this query language it is possible to descr...


SPARQL Update under RDFS Entailment in Fully Materialized and Redundancy-Free Triple Stores

Processing the dynamic evolution of RDF stores has recently been standardized in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated orthogonal to updates. Even the W3C’s recent SPARQL 1.1 Update language and SPARQL 1.1 Entailment Regimes specifications explicitly exclude a standard behavior how SPARQL endpoints should treat...


A Distributed Process Infrastructure for a Distributed Data Structure

The Resource Description Framework (RDF) is continuing to grow outside the bounds of its initial function as a metadata framework and into the domain of general-purpose data modeling. This expansion has been facilitated by the continued increase in the capacity and speed of RDF database repositories known as triple-stores. High-end RDF triple-stores can hold and process on the order of 10 billi...



Publication date: 2010